A speaker biased SI recognizer for embedded mobile applications

Authors

Yaxin Zhang

Bian Wu

Xiaolin Ren

Xin He

Abstract

Non-native and accented speakers often face problems when using a speaker-independent (SI) speech recognition system. Speaker adaptation is a common way to make an SI recognizer work better for individual speakers. Targeting embedded implementation and applications in fast-changing mobile environments, this paper proposes a supervised speaker adaptation (SA) solution with low system resource consumption, minimal disturbance to the data structures of the SI recognizer, and superior adaptation performance. When adapted by UK speakers on a digit recognition task, the US English speech recognizer achieved a 65.9% reduction in digit errors. Other advantages of the proposed SA method include multi-speaker adaptation, fast adaptation, and little loss of speaker independence after adaptation.

Similar resources

Experiments and Analysis of Speaker Dependent Mandarin Monosyllable Recognition

It is well known that the word accuracy of a speaker-independent (SI) continuous speech recognition system is not good enough for many real-world applications because of the many interference factors in the speech signal: pronunciation variance across speakers, different kinds of environmental noise, and so on. Thus, analyzing the action procedure of each interference factor, then eliminating its effect as po...

Speaking rate normalization with lattice-based context-dependent phoneme duration modeling for personalized speech recognizers on mobile devices

Voice access to cloud applications, including social networks, from mobile devices has become attractive. Personalized speech recognizers on mobile devices have become feasible because most mobile devices have only a single user. Speaking rate variation is known to be an important source of performance degradation for spontaneous speech recognition. Speaking rate is speaker dependent, it cha...

Speech Recognition Methods and their Potential for Dialogue Systems in Mobile Environments

The DaimlerChrysler speech recognizer is specialized for robust speech recognition in noisy environments, in particular for command and control applications. The recognizer that is used in cars has fixed grammars, which restrict the speaker to using short commands. This paper presents methods that allow the user to speak more freely and add spontaneous words to the commands: language modelling,...

Impact of Vocal Tract Length Normalization on the Speech Recognition Performance of an English Vowel Phoneme Recognizer for the Recognition of Children Voices

Differences in human vocal tract lengths can cause inter-speaker acoustic variability in speech signals produced by different speakers for the same text, and these variations affect the robustness of a speaker-independent (SI) speech recognition system. Speaker normalization using vocal tract length normalization (VTLN) is an effective approach to reduce the effect of these...

Model-combination-based acoustic mapping

We propose a new method for compensating distortions in the speech signal caused by environment changes. The basic method concentrates on additive noise, but can be extended to also address channel and, to some extent, speaker changes. Combined with adaptation techniques, this compensation leads to high error rate reductions for mobile speech applications. Thereby, it is more efficient than adap...

Publication date: 2005
